MEASURE-VALUED PROCESSES IN THE
CONTROL OF PARTIALLY-OBSERVABLE STOCHASTIC SYSTEMS


WENDELL  H.  FLEMING 


ABSTRACT 


This paper is concerned with the optimal control of continuous-time
Markov processes. The admissible control laws are based on white-noise
corrupted observations of a function of the state process. A
"separated" control problem is introduced, whose states are probability
measures on the original state space. The original and separated
control problems are related via the nonlinear filter equation. The
existence of a minimum for the separated problem is established. Under
more restrictive assumptions it is shown that the minimum expected
cost for the separated problem equals the infimum of expected costs
for the original problem with partially observed states.

1. Introduction.

We are concerned with optimal control of partially-observable
stochastic systems of the following kind. The state (or signal)
process is denoted by $x_t$, $0 \le t \le T$, with $x_t \in \Sigma$,
where $\Sigma$ is some given "state space". The control process is
denoted by $u_t$, $0 \le t \le T$, with $u_t \in U$, where $U$ is some
given "control space". The control $u_t$ is allowed to depend on
observations $y_s$ for $0 \le s \le t$. In this paper, we shall assume
that

(1.1)  $y_t = \int_0^t h(x_s)\,ds + w_t$,

where $w_t$ is a brownian motion process of some dimension $\nu$.
The object is to minimize a criterion of the form $E\Phi(x_T)$, given
an initial distribution of the random variable $x_0$.

A precise formulation of the partially-observable control problem
is given in §2. An open problem, apparently difficult, is to prove
the existence of an optimal control process in case of partial
observations. We do not solve this problem here. Instead, we
introduce a related control problem in §3, which we call the
"separated" problem. In the separated problem the "state" at
time $t$ is a probability measure $\pi_t$ on $\Sigma$. The state
process is governed by a stochastic partial differential equation,
driven by some $\nu$-dimensional brownian motion $b_t$ (see (3.1) for
this equation written in a weak form). In the separated problem, the
controller is allowed (roughly speaking) complete past observations
in choosing the control $u_t$. See §3 for the precise formulation.
The objective is to minimize $E\pi_T(\Phi)$, given $\pi_0$, where

(1.2)  $\pi(g) = \int_\Sigma g(x)\,d\pi(x)$.

The original control problem with partial observations and
the separated problem are related through the nonlinear filter
equation (2.5), which is the same as equation (3.1) if $\pi_t$ is the
conditional distribution of $x_t$ given past observations and
$b_t = \hat{w}_t$ is the innovation.

In §4 we establish some tightness and closure properties
associated with the separated problem. Then we prove a result
about the existence of a minimum for the separated problem
(Theorem 1). The method is an adaptation of [4]. If we let $\alpha_s$
denote the minimum of $E\pi_T(\Phi)$ in the separated problem and
$\alpha$ be the infimum of $E\Phi(x_T)$ in the original problem, then
the nonlinear filter equation implies that $\alpha_s \le \alpha$. In §9
we show that $\alpha_s = \alpha$, under fairly restrictive assumptions
(Theorem 3).

A result like Theorem 3 was proved by Bismut [2] when $\Sigma$ is a
finite set, under still more restrictive conditions.

A separated control problem with state space $\Sigma$ was also
considered by Segall [10]. He considered both observations of
the type (1.1) and point observations. A nonlinear semigroup
approach, when $\Sigma$ is finite, was taken by Davis [3].

Another case of considerable interest is when the state process
$x_t$ obeys a stochastic differential equation

(1.3)  $dx_t = f(x_t,u_t)\,dt + d\tilde{w}_t$,

where $\tilde{w}_t$ is a brownian motion independent of the brownian
motion $w_t$ in (1.1). Unfortunately, our results do not include this
case. A minor difficulty is that the state space $\Sigma$ is some
euclidean space, which is not compact as assumed in §2. A more
significant difficulty is that the generator associated with (1.3)
when $u$ is a constant control is an unbounded operator. The method
used to prove Theorem 3 would have to be changed to deal with this
case. It is hoped that the device of introducing the separated problem
may eventually be useful to study existence of optimal controls for
the partially observed control problem.


2. The Control Problem with Partial Observations.

Throughout the paper we assume that $x_t \in \Sigma$, where $\Sigma$
is a compact metric space; moreover, $u_t \in U$, where $U$ is a
compact, convex subset of euclidean $R^\mu$ for some $\mu$. In (1.1)
we assume that $h \in C(\Sigma;R^\nu)$; moreover, in the criterion to
be minimized $\Phi \in C(\Sigma)$, where $C(\Sigma) = C(\Sigma;R^1)$ is
the space of continuous real-valued functions on $\Sigma$.

We assume that for each constant control $u \in U$ there is a
semigroup $T^u_t$ on $C(\Sigma)$, associated with a Markov process
$x_t$. Let $\mathcal{A}^u$ denote the generator of $T^u_t$, with
domain $\mathcal{D}(\mathcal{A}^u)$.

We assume that:

(2.1)  There is a dense set $\mathcal{D} \subset C(\Sigma)$ such that
$\mathcal{D} \subset \mathcal{D}(\mathcal{A}^u)$ for all $u \in U$.
Given $g \in \mathcal{D}$, $\mathcal{A}^u g \in C(U;C(\Sigma))$.

(2.2)  Given $g \in C(\Sigma)$ and $t > 0$,
$T^u_t g(x) \in C(\Sigma \times U)$.

Let $D([0,T];\Sigma)$ denote the space of $\Sigma$-valued functions
which are right continuous and have left hand limits for each
$t \in (0,T]$; see [1].

Admissible systems (P.O.). Let $x,u,w$ be processes defined
on some probability space
$(\Omega,\mathcal{F},\{\mathcal{F}_t\},P)$, provided with an
increasing family $\{\mathcal{F}_t\}$ of $\sigma$-algebras,
$0 \le t \le T$. For brevity we write $x$ for the process instead of
$x_t$, $0 \le t \le T$, etc. We require that $x_t$ is
$\mathcal{F}_t$-measurable, and that $w$ is a brownian motion adapted
to $\{\mathcal{F}_t\}$. Moreover, the paths $x_\cdot(\omega)$ are in
$D([0,T];\Sigma)$ for each $\omega \in \Omega$. For
$g \in \mathcal{D}$ let

(2.3)  $m^g_t = g(x_t) - g(x_0) - \int_0^t \mathcal{A}^{u_s} g(x_s)\,ds$.

Let $\mathcal{Y}_t$ be the $\sigma$-algebra generated by $y_s$,
$0 \le s \le t$.

Definition. We say that $(x,u,w)$ is an admissible system
(P.O.) if:

(i)  $u_t$ is $\mathcal{Y}_t$-measurable, for $0 \le t \le T$.

(ii)  For each $g \in \mathcal{D}$, $m^g_t$ is a
$\{\mathcal{F}_t\}$-martingale and $\langle m^g,w \rangle_t = 0$.

We recall that the condition $\langle m^g,w \rangle_t = 0$ is
equivalent to requiring that $m^g_t w_t$ is an
$\{\mathcal{F}_t\}$-martingale [8]. The partially observed control
problem is to find an admissible system $(x,u,w)$ minimizing
$E\Phi(x_T)$, given $\Phi$ and the distribution of the initial
state $x_0$.

Since $x$ has right continuous paths and $w$ has continuous
paths, $m^g_t$ and $m^g_t w_t$ are also martingales with respect to
$\{\mathcal{F}_{t+}\}$. Hence, we may assume that $\{\mathcal{F}_t\}$
is a right continuous family. In addition, we may complete these
$\sigma$-algebras by adjoining $P$-null subsets of $\mathcal{F}$.

In the special case of a constant control $u$, we can let $x$
be a Markov process associated with the semigroup $T^u_t$.

The nonlinear filter equation. Let $\pi_t$ be a regular conditional
distribution for $x_t$ given $\mathcal{Y}_t$. Given $g \in C(\Sigma)$,

(2.4)  $\pi_t(g) = E[g(x_t) \mid \mathcal{Y}_t]$.

Since $\mathcal{Y}_0$ is the trivial $\sigma$-algebra, $\pi_0$ is the
distribution of $x_0$. The nonlinear filter equation [7, Theorem 8.1],
for $g \in \mathcal{D}$, is

(2.5)  $\pi_t(g) = \pi_0(g) + \int_0^t \pi_s(\mathcal{A}^{u_s} g)\,ds
        + \int_0^t [\pi_s(gh) - \pi_s(g)\pi_s(h)] \cdot d\hat{w}_s$,

where

(2.6)  $\hat{w}_t = y_t - \int_0^t \pi_s(h)\,ds$

is the innovation process. Note that $gh$ and $h$ have values
in $R^\nu$, and $\hat{w}_t$ is a $\nu$-dimensional brownian motion
adapted to $\{\mathcal{Y}_t\}$. From (2.4) with $g = \Phi$ we have

(2.7)  $E\Phi(x_T) = E\{E[\Phi(x_T) \mid \mathcal{Y}_T]\} = E\pi_T(\Phi)$.
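
As an illustration of (2.5), consider the special case in which
$\Sigma$ is a finite set $\{0,\ldots,d-1\}$, $\nu = 1$, and the control
is held fixed. A measure $\pi_t$ is then a probability vector, and
taking $g$ to be the indicator of state $j$ in (2.5) gives a system of
$d$ scalar equations. The Python sketch below is not part of the
argument of this paper. It assumes a generator matrix `Q` for the fixed
control, an Euler time discretization, and a clip-and-renormalize
safeguard; the names `filter_step` and `run_filter` are hypothetical.

```python
import math
import random


def filter_step(pi, Q, h, dy, dt):
    """One Euler step of (2.5) on a finite state space, scalar observations.

    pi : current conditional distribution (list of probabilities)
    Q  : generator matrix for the fixed control (rows sum to zero); assumed
    h  : observation function, h[i] = h(state i)
    dy : observation increment y_{t+dt} - y_t
    """
    d = len(pi)
    pi_h = sum(pi[i] * h[i] for i in range(d))            # pi_t(h)
    innovation = dy - pi_h * dt                           # innovation increment
    new = []
    for j in range(d):
        drift = sum(pi[i] * Q[i][j] for i in range(d))    # pi_t(A 1_j)
        gain = pi[j] * (h[j] - pi_h)                      # pi_t(1_j h) - pi_t(1_j) pi_t(h)
        new.append(pi[j] + drift * dt + gain * innovation)
    # Euler steps can leave the simplex; clipping and renormalising is a
    # numerical safeguard, not part of (2.5).
    new = [max(p, 1e-12) for p in new]
    total = sum(new)
    return [p / total for p in new]


def run_filter(Q, h, pi0, T=1.0, n_steps=2000, seed=0):
    """Simulate the chain, the observations (1.1), and the filter together."""
    rng = random.Random(seed)
    dt = T / n_steps
    d = len(pi0)
    x = rng.choices(range(d), weights=pi0)[0]             # x_0 has distribution pi_0
    pi = list(pi0)
    for _ in range(n_steps):
        # observation increment: h(x_t) dt plus a Brownian increment
        dy = h[x] * dt + rng.gauss(0.0, math.sqrt(dt))
        pi = filter_step(pi, Q, h, dy, dt)
        # jump of the chain: probability Q[x][j] * dt of moving to j != x
        r, acc = rng.random(), 0.0
        for j in range(d):
            if j != x:
                acc += Q[x][j] * dt
                if r < acc:
                    x = j
                    break
    return x, pi
```

When $h$ is constant the gain term vanishes, so the observations carry
no information and the sketch reduces to the forward equation for the
unconditional distribution of the chain.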

3. The Separated Control Problem.

Let $\mathcal{M} = \mathcal{M}(\Sigma)$ be the space of probability
measures on the compact metric space $\Sigma$. We give $\mathcal{M}$
the w*-topology; then $\mathcal{M}$ is compact, metrizable. In the
separated problem the "state" process $\pi_t$ is measure-valued, with
$\pi_t \in \mathcal{M}$. We define admissible systems for the
separated problem as follows. Let $\pi,u,b$ be processes defined on
some probability space $(\Omega,\mathcal{G},\{\mathcal{G}_t\},P)$,
provided with an increasing family $\{\mathcal{G}_t\}$ of
$\sigma$-algebras, $0 \le t \le T$. We require that $\pi_t$ and $u_t$
are $\mathcal{G}_t$-measurable, and that $b_t$ is a brownian motion of
dimension $\nu$ adapted to $\{\mathcal{G}_t\}$.

Definition. We say that $(\pi,u,b)$ is an admissible system (S)
if, for each $g \in \mathcal{D}$,

(3.1)  $\pi_t(g) = \pi_0(g) + \int_0^t \pi_s(\mathcal{A}^{u_s} g)\,ds
        + \int_0^t [\pi_s(gh) - \pi_s(g)\pi_s(h)] \cdot db_s$.

The separated control problem is as follows. Given
$\Phi \in C(\Sigma)$ and $\pi_0 \in \mathcal{M}$, find an admissible
system $(\pi,u,b)$ minimizing $E\pi_T(\Phi)$.
We emphasize that the separated problem is defined without reference
to the partially observed problem in §2. However, equations
(2.5)-(2.7) imply the following relationship between the partially
observed and separated problems. Given $\pi_0$, let

(3.2)  $\alpha = \inf\{E\Phi(x_T) : (x,u,w)$ admissible (P.O.), $x_0$ has distribution $\pi_0\}$,

(3.3)  $\alpha_s = \inf\{E\pi_T(\Phi) : (\pi,u,b)$ admissible (S), $\pi_0$ given$\}$.

If $(x,u,w)$ is admissible (P.O.), let $\pi_t$ be a regular
conditional distribution of $x_t$ given $\mathcal{Y}_t$; let
$(\Omega,\{\mathcal{G}_t\},P) = (\Omega,\{\mathcal{Y}_t\},P)$. Then
$(\pi,u,\hat{w})$ is admissible (S). By (2.7) and the definition of
$\alpha_s$,

$\alpha_s \le E\pi_T(\Phi) = E\Phi(x_T)$.

Since this is true for all systems admissible (P.O.),

(3.4)  $\alpha_s \le \alpha$.

In §9, we will show that $\alpha_s = \alpha$ under the restrictive
assumptions that the generators $\mathcal{A}^u$ are bounded operators,
and that the control $u$ enters linearly.


4. Tightness; Closure Properties (Separated Problem).

If $(\pi,u,b)$ is an admissible system (S), then by (3.1) $\pi_t(g)$
is continuous on $[0,T]$ for each fixed $g \in \mathcal{D}$. The same
is true for $g \in C(\Sigma)$, since $\mathcal{D}$ is dense in
$C(\Sigma)$ and $\pi_t(\Sigma) = 1$. Since $\mathcal{M}$ has the
w*-topology, the measure-valued process $\pi_t$ has paths in
$C([0,T];\mathcal{M})$.

Consider any collection of admissible systems $(\pi,u,b)$.
Let us show tightness of the corresponding collection of probability
distributions of $(\pi,b)$, which are measures on
$C([0,T];\mathcal{M}) \times C([0,T];R^\nu)$. This is Lemma 2 below.
Let us write $\pi_\cdot(g)$ for the sample path $\pi_t(g)$,
$0 \le t \le T$.


Lemma 1. For every $g \in \mathcal{D}$, $\varepsilon > 0$ there
exists a compact set $B_{\varepsilon g} \subset C([0,T])$ such that

$P(\pi_\cdot(g) \in B_{\varepsilon g}) > 1 - \varepsilon$.

Proof. By (3.1)

$\pi_t(g) - \pi_r(g) = F_g(t) - F_g(r) + M_g(t) - M_g(r)$,

$F_g(t) = \int_0^t \pi_s(\mathcal{A}^{u_s} g)\,ds$,

$M_g(t) = \int_0^t [\pi_s(gh) - \pi_s(g)\pi_s(h)] \cdot db_s$.

We have $|\pi_s(\mathcal{A}^{u_s} g)| \le \|\mathcal{A}^{u_s} g\| \le K_g$,
by assumption (2.1) and compactness of the control space $U$. Hence,
$F_g(\cdot)$ is Lipschitz with constant $K_g$, and $F_g(0) = 0$.
Moreover, $M_g(t)$ is a martingale with increasing process

$\langle M_g \rangle(t) = \int_0^t |\pi_s(gh) - \pi_s(g)\pi_s(h)|^2\,ds$.

Since $|\pi_s(gh) - \pi_s(g)\pi_s(h)|^2 \le K'_g$, we have
$\langle M_g \rangle(\cdot)$ Lipschitz with constant $K'_g$. From these
facts, Lemma 1 follows by well-known arguments [9, Proposition 9],
[5, Lemma 4].

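Lemma 2. For every $\varepsilon > 0$ there exist compact sets
$A_{\varepsilon 1} \subset C([0,T];\mathcal{M})$ and
$A_{\varepsilon 2} \subset C([0,T];R^\nu)$ such that

$P(\pi_\cdot \in A_{\varepsilon 1}) > 1 - \varepsilon$,
$P(b_\cdot \in A_{\varepsilon 2}) > 1 - \varepsilon$.

The existence of $A_{\varepsilon 1}$ follows from Lemma 1 and
elementary properties of the w*-topology on $\mathcal{M}$; see
[5, Lemma 3]. The existence of $A_{\varepsilon 2}$ is a known property
of brownian motion.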


Closure results. Let us consider sequences of admissible
systems $(\pi_n,u_n,b_n)$, $n = 1,2,\ldots$, all defined on the same
$(\Omega,\{\mathcal{G}_t\},P)$. If this sequence has a limit
$(\pi,u,b)$, in a suitable sense, we wish to give conditions under
which $(\pi,u,b)$ is admissible. In the first closure result we
consider constant controls $u_{nt} \equiv u_n$.

We recall that $\mathcal{M}$ with the w*-topology is metrizable.
Hence, one can consider uniform convergence on $[0,T]$ of sequences
$\pi_{nt}$. This is equivalent to uniform convergence of $\pi_{nt}(g)$
to $\pi_t(g)$ for each $g \in C(\Sigma)$.

Lemma 3. Let $(\pi_n,u_n,b_n)$ be admissible (S), $n = 1,2,\ldots$,
with $u_n \in U$ a constant control such that $u_n \to u$ as
$n \to \infty$. Suppose that $\pi_{nt} \to \pi_t$, $b_{nt} \to b_t$
uniformly on $[0,T]$ as $n \to \infty$, with probability 1. Then
$(\pi,u,b)$ is admissible (S).


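Proof. For each $g \in \mathcal{D}$

$\pi_{nt}(g) = \pi_{n0}(g) + \int_0^t \pi_{ns}(\mathcal{A}^{u_n} g)\,ds
        + \int_0^t e_{ns} \cdot db_{ns}$,

$e_{ns} = \pi_{ns}(gh) - \pi_{ns}(g)\pi_{ns}(h)$.

We have, since $\pi_{ns}(\Sigma) = 1$,

$|\pi_{ns}(\mathcal{A}^{u_n} g) - \pi_{ns}(\mathcal{A}^u g)|
        \le \|\mathcal{A}^{u_n} g - \mathcal{A}^u g\|$.

By (2.1), $\|\mathcal{A}^{u_n} g - \mathcal{A}^u g\| \to 0$; and since
$\mathcal{A}^u g \in C(\Sigma)$,
$\pi_{ns}(\mathcal{A}^u g) \to \pi_s(\mathcal{A}^u g)$ uniformly on
$[0,T]$ with probability 1. Hence, with probability 1

$\lim_{n \to \infty} \int_0^t \pi_{ns}(\mathcal{A}^{u_n} g)\,ds
        = \int_0^t \pi_s(\mathcal{A}^u g)\,ds$,  $0 \le t \le T$.

Moreover, $e_{ns}$ is uniformly bounded and tends uniformly on $[0,T]$
to $e_s = \pi_s(gh) - \pi_s(g)\pi_s(h)$, with probability 1.

By Lemma 1, given $\varepsilon > 0$ there exists a compact set
$D_\varepsilon \subset C([0,T];R^\nu)$ such that
$P(e_{n\cdot} \in D_\varepsilon) > 1 - \varepsilon$. Since compact
subsets of $C([0,T];R^\nu)$ are equicontinuous, it follows by using
piecewise constant approximations that as $n \to \infty$

$\int_0^t e_{ns} \cdot db_{ns} \to \int_0^t e_s \cdot db_s$

in probability. See [4, pp. 789-790]. Therefore, $(\pi,u,b)$
satisfies (3.1) for each $g \in \mathcal{D}$, which shows that
$(\pi,u,b)$ is admissible. This proves Lemma 3.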

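Let us now establish a second closure result, which will be
used in §5 for the proof of an existence theorem for an optimal
control in the separated problem. We recall that
$u = (u^1,\ldots,u^\mu) \in U$, where $U \subset R^\mu$. We now impose
the following assumption on the form of the generators
$\mathcal{A}^u$:

(4.1)  $\mathcal{A}^u = \mathcal{A}^0 + \mathcal{A}^1 \cdot u$, where
$\mathcal{A}^0 : \mathcal{D}(\mathcal{A}^0) \to C(\Sigma)$,
$\mathcal{A}^1 : \mathcal{D}(\mathcal{A}^1) \to C(\Sigma;R^\mu)$ are
linear operators with $\mathcal{D} \subset \mathcal{D}(\mathcal{A}^i)$,
$i = 0,1$.

Note that (4.1) implies (2.1).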

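When (4.1) holds, let

$v_t = \int_0^t u_s\,ds$.

If $(\pi,u,b)$ is an admissible system (S), let us call $(\pi,v,b)$ an
admissible system (S'). Since
$\pi_s(\mathcal{A}^{u_s} g) = \pi_s(\mathcal{A}^0 g) + \pi_s(\mathcal{A}^1 g) \cdot u_s$
and $dv_s = u_s\,ds$, equation (3.1) can now be rewritten as

(4.2)  $\pi_t(g) = \pi_0(g) + \int_0^t \pi_s(\mathcal{A}^0 g)\,ds
        + \int_0^t \pi_s(\mathcal{A}^1 g) \cdot dv_s
        + \int_0^t [\pi_s(gh) - \pi_s(g)\pi_s(h)] \cdot db_s$.
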

Thus, the conditions that $(\pi,v,b)$ be admissible (S') are that
$\pi_t,v_t$ be $\mathcal{G}_t$-measurable, that $b_t$ be a brownian
motion adapted to $\{\mathcal{G}_t\}$, that (4.2) hold for every
$g \in \mathcal{D}$, and that $u_t = dv_t/dt$ is in $U$ almost
everywhere on $[0,T]$ with probability 1.

Since $U$ is compact, $|u_s| \le N$ for some $N$. Hence, $v_\cdot$ is
Lipschitz with constant $N$. Since $v_0 = 0$, $v_\cdot$ lies in a fixed
compact subset of $C([0,T];R^\mu)$. We then have by Lemma 2:

Lemma 4. For every $\varepsilon > 0$ there exists a compact set
$A_\varepsilon \subset C([0,T];\mathcal{M} \times R^\mu \times R^\nu)$
such that $P((\pi_\cdot,v_\cdot,b_\cdot) \in A_\varepsilon) > 1 - \varepsilon$.

The second closure result is:

Lemma 5. Let $(\pi_n,v_n,b_n)$ be admissible (S'), $n = 1,2,\ldots$.
Suppose that (4.1) holds and that
$(\pi_{nt},v_{nt},b_{nt}) \to (\pi_t,v_t,b_t)$ uniformly on $[0,T]$ as
$n \to \infty$, with probability 1. Then $(\pi,v,b)$ is admissible (S').


Proof. Consider any $g \in \mathcal{D}$. Since
$\mathcal{A}^0 g \in C(\Sigma)$,
$\pi_{ns}(\mathcal{A}^0 g) \to \pi_s(\mathcal{A}^0 g)$ uniformly on
$[0,T]$. Similarly
$\pi_{ns}(\mathcal{A}^1 g) \to \pi_s(\mathcal{A}^1 g)$ uniformly on
$[0,T]$. Since $v_{ns} \to v_s$ uniformly on $[0,T]$ and
$|dv_{nt}/dt| \le N$, we have

$\lim_{n \to \infty} \int_0^t \pi_{ns}(\mathcal{A}^0 g)\,ds
        = \int_0^t \pi_s(\mathcal{A}^0 g)\,ds$,

$\lim_{n \to \infty} \int_0^t \pi_{ns}(\mathcal{A}^1 g) \cdot dv_{ns}
        = \int_0^t \pi_s(\mathcal{A}^1 g) \cdot dv_s$.

The same proof as Lemma 3 then shows that $(\pi,v,b)$ satisfies (4.2),
for each $g \in \mathcal{D}$. Finally, since $u_{nt} = dv_{nt}/dt$ is
in $U$, $U$ is compact and convex, and $v_{nt} \to v_t$ uniformly on
$[0,T]$, we have $dv_t/dt \in U$. Hence, $(\pi,v,b)$ is admissible (S').

5. An Existence Theorem (Separated Problem).

In order to show that there is a minimum in the separated
problem, we show that there is an admissible system (S') for which
the infimum in (3.3) is attained. This will be proved using
results in §4, following the method of [4]. The distribution of
a triple $(\pi,v,b)$ is a probability measure on
$C([0,T];\mathcal{M} \times R^\mu \times R^\nu)$. Triples $(\pi,v,b)$,
$(\bar{\pi},\bar{v},\bar{b})$ with the same distribution measure are
identical in distribution.

Theorem 1. Suppose that (4.1) holds. Then there exists an
admissible system (S') $(\pi,v,b)$ such that $E\pi_T(\Phi) = \alpha_s$.

Proof. Let $(\pi_n,v_n,b_n)$ be a minimizing sequence (S'); thus
$E\pi_{nT}(\Phi) \ge \alpha_s$ and $E\pi_{nT}(\Phi) \to \alpha_s$ as
$n \to \infty$. By Lemma 4 and Skorokhod's theorem, there exist a
subsequence of $n$ and $(\bar{\pi}_n,\bar{v}_n,\bar{b}_n)$ identical in
distribution with $(\pi_n,v_n,b_n)$ such that
$\bar{\pi}_{nt},\bar{v}_{nt},\bar{b}_{nt}$ tend to limits
$\bar{\pi}_t,\bar{v}_t,\bar{b}_t$ uniformly on $[0,T]$, with
probability 1. By Lemma 5, $(\bar{\pi},\bar{v},\bar{b})$ is an
admissible system (S'). Moreover

$\alpha_s = \lim_{n \to \infty} E\bar{\pi}_{nT}(\Phi) = E\bar{\pi}_T(\Phi)$.

This proves Theorem 1.

6. Constant Controls.

The remainder of this paper is concerned with the relationship
between the infima $\alpha$ and $\alpha_s$ in (3.2), (3.3). For this
purpose, we consider piecewise constant controls in §§7 and 8. In
preparation, let us suppose in this section that $u$ is a constant
control, $u_t = u$ for $0 \le t \le T$.

In §§6-8 we do not use linearity of $\mathcal{A}^u$ in $u$ in (4.1).
Instead, we make the general assumptions (2.1) and (2.2).

Lemma 6. Given $\pi_0 \in \mathcal{M}$, $u \in U$, and a brownian
motion $b$, there exists an $\mathcal{M}$-valued process $\pi$ which is
a solution to (3.1) for all $g \in \mathcal{D}$. Moreover, the
distribution of $\pi$ is unique (it depends only on $\pi_0$ and $u$).

Lemma 6 follows from results of Kunita [6] and Szpirglas [11].
Kunita's construction [6, p. 374] gives uniqueness in distribution
to the corresponding equation for $\pi_t$ written in terms of the
semigroup on $C(\Sigma)$ generated by $\mathcal{A}^u$. In
[11, Th. III.1] Szpirglas showed that that equation is equivalent to
(3.1).

Lemma 7. Given $\pi_0 \in \mathcal{M}$, $u \in U$, and
$F \in C(\mathcal{M})$, let $\Psi(\pi_0,u;F,T) = EF(\pi_T)$. Then
$\Psi$ is continuous on $\mathcal{M} \times U$.

Proof. Let $\pi_{n0} \to \pi_0$, $u_n \to u$ ($u_n \in U$); and let
$(\pi_n,u_n,b_n)$ be admissible (S) with $\pi_{n0}$ the state of the
process $\pi_n$ when $t = 0$. By Lemma 2 and Skorokhod's theorem, there
exist $(\bar{\pi}_n,\bar{b}_n)$ identical in distribution to
$(\pi_n,b_n)$ and a subsequence of $n$ such that $u_n \to u$,
$(\bar{\pi}_{nt},\bar{b}_{nt}) \to (\bar{\pi}_t,\bar{b}_t)$ uniformly
on $[0,T]$ with probability 1. By Lemma 3, $(\bar{\pi},u,\bar{b})$ is
admissible (S). Moreover, $\bar{\pi}_0 = \pi_0$. Then

$\lim_{n \to \infty} \Psi(\pi_{n0},u_n;F,T)
        = \lim_{n \to \infty} EF(\bar{\pi}_{nT})
        = EF(\bar{\pi}_T) = \Psi(\pi_0,u;F,T)$.

This proves Lemma 7.

Note that in defining $\Psi(\pi_0,u;F,T)$, we have used the uniqueness
in distribution of $\pi_t$, which is implied by Lemma 6.

From Lemma 7 and compactness of $\mathcal{M} \times U$ we have:

Corollary. $V(\pi) = \min_{u \in U} \Psi(\pi,u;F,T)$ is continuous on
$\mathcal{M}$.


7. $\Delta$-Admissible Systems (S).

In this section and in §8, we let $\Delta$ denote a fixed partition
of $[0,T]$ into subintervals $[t_k,t_{k+1}]$, with
$0 = t_0 < t_1 < \ldots < t_m = T$. We define $V_k(\pi)$ by backward
induction on $k$. For $\pi \in \mathcal{M}$

(7.1)  $V_m(\pi) = \pi(\Phi)$,

(7.2)  $V_k(\pi) = \min_{u \in U} \Psi(\pi,u;V_{k+1},t_{k+1} - t_k)$,
        $k = 0,1,\ldots,m-1$.

By the Corollary in §6, $V_k \in C(\mathcal{M})$.

Equation (7.2) is a discrete-time dynamic programming equation
for the separated control problem, with constant control on each
interval $[t_k,t_{k+1})$, in a sense which we shall indicate below.
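
The backward induction (7.1)-(7.2) can be sketched numerically in the
simplest case, a two-point state space $\Sigma = \{0,1\}$ with scalar
observations. A measure $\pi$ is then identified with $p = \pi(\{1\})$,
and (3.1) with $g$ the indicator of state 1 reduces to
$dp = [(1-p)q_{01} - p\,q_{10}]\,dt + p(1-p)(h_1 - h_0)\,db$. The Python
sketch below is illustrative only and rests on several assumptions that
are not in the text: the minimum over $U$ is replaced by a minimum over
a finite list of controls, each represented by its pair of jump rates
$(q_{01},q_{10})$; $\Psi$ is estimated by Monte Carlo with an Euler
scheme; and $V_{k+1}$ is interpolated linearly on a uniform grid. The
names `interp`, `psi`, and `backward_induction` are hypothetical.

```python
import math
import random


def interp(grid_vals, p):
    """Linear interpolation of values given on a uniform grid of [0, 1]."""
    n = len(grid_vals) - 1
    p = min(max(p, 0.0), 1.0)
    i = min(int(p * n), n - 1)
    w = p * n - i
    return (1.0 - w) * grid_vals[i] + w * grid_vals[i + 1]


def psi(p0, rates, h, F_vals, tau, noises):
    """Monte Carlo estimate of Psi(pi, u; F, tau) = E F(pi_tau), constant control."""
    q01, q10 = rates
    dt = tau / len(noises[0])
    total = 0.0
    for path in noises:
        p = p0
        for z in path:
            drift = (1.0 - p) * q01 - p * q10          # pi(A^u 1_{1})
            gain = p * (1.0 - p) * (h[1] - h[0])       # pi(gh) - pi(g) pi(h)
            p = p + drift * dt + gain * math.sqrt(dt) * z
            p = min(max(p, 0.0), 1.0)                  # keep p a probability
        total += interp(F_vals, p)
    return total / len(noises)


def backward_induction(phi, controls, h, times,
                       n_grid=10, n_paths=200, n_sub=10, seed=0):
    """Return the grid and the list [V_0, ..., V_m] of grid values."""
    rng = random.Random(seed)
    # the same Gaussian samples are reused for every expectation
    noises = [[rng.gauss(0.0, 1.0) for _ in range(n_sub)]
              for _ in range(n_paths)]
    grid = [i / n_grid for i in range(n_grid + 1)]
    V = [(1.0 - p) * phi[0] + p * phi[1] for p in grid]    # (7.1): V_m = pi(Phi)
    values = [V]
    for k in range(len(times) - 2, -1, -1):
        tau = times[k + 1] - times[k]
        # (7.2): minimise over the finite control list at each grid point
        V = [min(psi(p, u, h, V, tau, noises) for u in controls)
             for p in grid]
        values.insert(0, V)
    return grid, values
```

For example, `backward_induction([0.0, 1.0], [(0.5, 0.5), (2.0, 0.1)],
[0.0, 1.0], [0.0, 0.5, 1.0])` returns approximations to $V_0,V_1,V_2$ on
an 11-point grid. The accuracy of such a scheme is not analyzed here.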


Definition. An admissible system $(\pi,u,b)$ is $\Delta$-admissible (S)
if $u_t$ is constant for $t_k \le t < t_{k+1}$, $k = 0,1,\ldots,m-1$.

We recall that an admissible system (S) is defined on some
$(\Omega,\mathcal{G},\{\mathcal{G}_t\},P)$.

Lemma 8. If $(\pi,u,b)$ is $\Delta$-admissible, then

$V_k(\pi_{t_k}) \le E\{\pi_T(\Phi) \mid \mathcal{G}_{t_k}\}$,  $P$-a.s.

Proof. We use backward induction on $k$. For $k = m$, $t_m = T$ and

$V_m(\pi_T) = \pi_T(\Phi) = E\{\pi_T(\Phi) \mid \mathcal{G}_T\}$,  $P$-a.s.

Suppose Lemma 8 is true for $k + 1$. Then

(*)  $E\{\pi_T(\Phi) \mid \mathcal{G}_{t_k}\}
        = E\{E\{\pi_T(\Phi) \mid \mathcal{G}_{t_{k+1}}\} \mid \mathcal{G}_{t_k}\}
        \ge E\{V_{k+1}(\pi_{t_{k+1}}) \mid \mathcal{G}_{t_k}\}$.

Let $\xi$ denote $(\pi,u,b)$ restricted to $[t_k,t_{k+1}]$, and let
$\Gamma_k = \Gamma_k(\omega;\cdot)$ be a regular conditional
distribution of this triple given $\mathcal{G}_{t_k}$. With
$P$-probability 1, $u_t$ is a constant $u_k$ on $[t_k,t_{k+1})$ and
$\pi_{t_k}$ is constant $\Gamma_k$-almost surely. Moreover, the
restriction of $b_t$ to $[t_k,t_{k+1}]$ is a $\Gamma_k$-brownian
motion. Let

$G(\xi,t) = \pi_t(g) - \pi_{t_k}(g) - \int_{t_k}^t \pi_s(\mathcal{A}^{u_k} g)\,ds
        - \int_{t_k}^t [\pi_s(gh) - \pi_s(g)\pi_s(h)] \cdot db_s$.

Let $G(\xi) = \max_{t_k \le t \le t_{k+1}} |G(\xi,t)|$. By (3.1),
$G(\xi) = 0$, $P$-almost surely. Hence, $G(\xi) = 0$,
$\Gamma_k$-almost surely, with $P$-probability 1. With respect to the
measure $\Gamma_k$, $\xi$ is a solution of $G(\xi,t) = 0$ on
$[t_k,t_{k+1}]$, for each $g \in \mathcal{D}$. Hence, $P$-almost surely

$E\{V_{k+1}(\pi_{t_{k+1}}) \mid \mathcal{G}_{t_k}\}
        = \Psi(\pi_{t_k},u_k;V_{k+1},t_{k+1} - t_k) \ge V_k(\pi_{t_k})$.

This, together with (*), proves Lemma 8.

Since $\pi_0$ is given (not random) Lemma 8 implies when $k = 0$

(7.3)  $V_0(\pi_0) \le E\pi_T(\Phi)$.

Let

(7.4)  $\alpha_s^\Delta = \inf\{E\pi_T(\Phi) : (\pi,u,b)$ $\Delta$-admissible (S), $\pi_0$ given$\}$.

Then (7.3) implies that $V_0(\pi_0) \le \alpha_s^\Delta$. In fact,
$V_0(\pi_0) = \alpha_s^\Delta$. This follows from Theorem 2 in the next
section. A direct proof that $V_0(\pi_0) = \alpha_s^\Delta$ could also
be given in terms of the separated problem only, without reference to
admissible systems (P.O.); but we shall not do so.

In a similar way, $V_k(\pi_{t_k})$ is the infimum of $E\pi_T(\Phi)$
for a separated problem on $[t_k,T]$, using controls constant on
intervals $[t_\ell,t_{\ell+1})$, $\ell \ge k$, and with $\pi_{t_k}$
given. This justifies calling (7.2) a discrete-time dynamic
programming equation.




8. $\Delta$-Admissible Systems (P.O.).

As in §7, let $\Delta$ be a fixed partition of $[0,T]$.

Definition. An admissible system $(x,u,w)$ is $\Delta$-admissible
(P.O.) if $u_t$ is constant for $t_k \le t < t_{k+1}$,
$k = 0,1,\ldots,m-1$.

Given $\pi_0 \in \mathcal{M}$ let

(8.1)  $\alpha^\Delta = \inf\{E\Phi(x_T) : (x,u,w)$ $\Delta$-admissible (P.O.), $x_0$ has distribution $\pi_0\}$.

As in (3.4) we have $\alpha_s^\Delta \le \alpha^\Delta$. The purpose
of the present section is to prove:

Theorem 2. $\alpha^\Delta = \alpha_s^\Delta = V_0(\pi_0)$.

Since $V_0(\pi_0) \le \alpha_s^\Delta \le \alpha^\Delta$, it is
sufficient to prove that, for any $\varepsilon > 0$, there exists
$(x,u,w)$ $\Delta$-admissible (P.O.) such that $x_0$ has distribution
$\pi_0$ and

(8.2)  $E\{\Phi(x_T)\} < V_0(\pi_0) + \varepsilon$.

This follows from Lemma 10 below.

We begin with the following construction, similar to one used
by Bensoussan-Lions [12]. Let $\mathcal{O}_1,\ldots,\mathcal{O}_\ell$
be disjoint, Borel measurable subsets of $\mathcal{M}$, with
$\mathcal{M} = \mathcal{O}_1 \cup \ldots \cup \mathcal{O}_\ell$. Let
$u_{kj} \in U$, $k = 0,1,\ldots,m-1$, $j = 1,\ldots,\ell$. Given an
initial distribution $\pi_0$ for $x_0$, we wish to construct a
$\Delta$-admissible system (P.O.) $(x,u,w)$ with the property

(8.3)  $u_t = u_{kj}$ if $\pi_{t_k} \in \mathcal{O}_j$,
        $t_k \le t < t_{k+1}$, $k = 0,1,\ldots,m-1$,

where $\pi_{t_k}$ is a regular conditional distribution of $x_{t_k}$
given $\mathcal{Y}_{t_k}$.

The system $(x,u,w)$ will be defined on the "canonical" sample
space

$\Omega = D([0,T];\Sigma) \times C([0,T];R^\nu)$,

whose elements we denote by $(x_\cdot,w_\cdot)$. Let $\mathcal{F}_t$ be
generated by $(x_s,w_s)$ paths for $0 \le s \le t$, with
$\mathcal{F} = \mathcal{F}_T$. We define by induction a sequence of
probability measures $P_0,P_1,\ldots,P_{m-1}$ as follows; then we take
$P = P_{m-1}$. The measure $P_k$ will be defined on
$\mathcal{F}_{t_{k+1}}$.

Let $Q^u_{xk}$ be the probability distribution on
$D([t_k,t_{k+1}];\Sigma)$ of a Markov process with initial state
$x_{t_k} = x$ and generator $\mathcal{A}^u$. From assumption (2.2) and
the Markov property $Q^u_{xk}$ depends continuously on
$(x,u) \in \Sigma \times U$ in the sense of convergence of finite
dimensional distributions. Let $W_{wk}$ be Wiener measure on
$C([t_k,t_{k+1}];R^\nu)$ for paths starting at $w_{t_k} = w$.

For $0 \le t < t_1$, the control is constant: $u_t = u_0 = u_{0j}$ for
that $j$ such that $\pi_0 \in \mathcal{O}_j$. We define $P_0$ on
$\mathcal{F}_{t_1}$ as the product measure $P_0 = Q_0 \times W_{00}$,
where

$Q_0(B) = \int_\Sigma Q^{u_0}_{x0}(B)\,d\pi_0(x)$,  $B \in \mathcal{F}_{t_1}$.

Now suppose that $P_0,P_1,\ldots,P_{k-1}$ have been defined, as well
as piecewise constant controls $u_t$ for $0 \le t < t_k$. As in (8.3),
we define $u_t = u_{kj}$ if $\pi_{t_k} \in \mathcal{O}_j$,
$t_k \le t < t_{k+1}$, where $\pi_{t_k}$ is a regular conditional
distribution of $x_{t_k}$ given $\mathcal{Y}_{t_k}$ with respect to the
measure $P_{k-1}$. Let $(x'_\cdot,w'_\cdot)$, $(x''_\cdot,w''_\cdot)$
denote the restrictions to $[0,t_k]$ and $[t_k,t_{k+1}]$ respectively
of $(x_\cdot,w_\cdot)$. The measure $P_k$ is defined first for subsets
of $\Omega$ of the form $B' \times B''$, where
$B' \in \mathcal{F}_{t_k}$ is generated by $(x',w')$ paths and $B''$ is
a "window set" of the form
$B'' = \{(x'',w'') : (x_{s_i},w_{s_i}) \in A_i\}$ for finitely many
$s_i \in (t_k,t_{k+1}]$. Let $Q_k = Q^{u_k}_{x_{t_k} k}$ and
$W_k = W_{w_{t_k} k}$, where $u_k = u_t$ for $t_k \le t < t_{k+1}$.
Then

(8.4)  $P_k(B' \times B'') = \int_{B'} (Q_k \times W_k)(B'')\,dP_{k-1}(x'_\cdot,w'_\cdot)$.

This determines the probability measure $P_k$ on
$\mathcal{F}_{t_{k+1}}$. We take $P = P_{m-1}$.


Lemma 9. The system $(x,u,w)$ is $\Delta$-admissible (P.O.).

Proof. By construction, $u_t$ is $\mathcal{Y}_t$-measurable and
constant on each interval of the partition $\Delta$. According to the
definition in §2 it suffices to verify that, for each
$g \in \mathcal{D}$, $m^g_t$ and $m^g_t w_t$ are
$\{\mathcal{F}_t\}$-martingales, where $m^g_t$ is defined by (2.3).
Let us first consider $t_k \le r < t \le t_{k+1}$. Let
$\mathcal{F}''_r$ be the $\sigma$-algebra generated by $x'',w''$ paths
restricted to $[t_k,r]$. Then

$E^{Q_k \times W_k}\{(m^g_t - m^g_r) \mid \mathcal{F}''_r\}
        = E^{Q_k}\{g(x_t) - g(x_r) - \int_r^t \mathcal{A}^{u_k} g(x_s)\,ds \mid \mathcal{F}''_r\} = 0$.

Hence, for any $B'' \in \mathcal{F}''_r$,

$\int_{B''} (m^g_t - m^g_r)\,d(Q_k \times W_k) = 0$.

Since $m^g_t - m^g_r$ and $w_t - w_r$ are independent with respect to
$Q_k \times W_k$, we also have

$\int_{B''} (m^g_t - m^g_r)(w_t - w_r)\,d(Q_k \times W_k) = 0$.

For $B \in \mathcal{F}_r$ of the form $B = B' \times B''$ with
$B' \in \mathcal{F}_{t_k}$ and $B'' \in \mathcal{F}''_r$,

$\int_B (m^g_t - m^g_r)\,dP_k
        = \int_{B'} \int_{B''} (m^g_t - m^g_r)\,d(Q_k \times W_k)\,dP_{k-1} = 0$.

Thus, $E\{m^g_t \mid \mathcal{F}_r\} = m^g_r$. Similarly,

$E\{(m^g_t - m^g_r)(w_t - w_r) \mid \mathcal{F}_r\} = 0$,

from which $E\{m^g_t w_t \mid \mathcal{F}_r\} = m^g_r w_r$, for
$t_k \le r < t \le t_{k+1}$. If $r < t_k \le t \le t_{k+1}$, we first
condition on $\mathcal{F}_{t_k}$ and then on $\mathcal{F}_r$. This
shows that $m^g_t$ and $m^g_t w_t$ are
$\{\mathcal{F}_t\}$-martingales, from which Lemma 9 follows.


Lemma 10. Given $\varepsilon > 0$ there exists an admissible system
(P.O.) $(x,u,w)$ such that

$E\{\pi_T(\Phi) \mid \mathcal{Y}_{t_k}\} < V_k(\pi_{t_k}) + \varepsilon(1 - k/m)$

for $k = 0,1,\ldots,m-1$ and with any distribution $\pi_0$ for $x_0$.
Here $\pi_{t_k}$ is a regular conditional distribution for $x_{t_k}$
given $\mathcal{Y}_{t_k}$.


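Proof. In the construction above we choose the partition
$\mathcal{O}_1,\ldots,\mathcal{O}_\ell$ of $\mathcal{M}$ fine enough
and $u_{kj}$ such that

$\Psi(\pi,u_{kj};V_{k+1},t_{k+1} - t_k) < V_k(\pi) + \varepsilon/m$

for all $\pi \in \mathcal{O}_j$, $k = 0,1,\ldots,m-1$. This is possible
by Lemma 7 and compactness of $\mathcal{M} \times U$. The probability
measure $P = P_{m-1}$ and the piecewise-constant control process $u$ on
the canonical sample space are defined by the construction. We now
proceed by backward induction on $k$ (compare with the proof of
Lemma 8). For $k = m$, $V_m(\pi_T) = \pi_T(\Phi)$. Proceeding
inductively, suppose Lemma 10 true for $k + 1$. Let $\Theta_k$ be a
regular conditional distribution for $(x,u,w)$ given
$\mathcal{Y}_{t_k}$. With $P$-probability 1, $u_t$ is a constant
$u_{kj}$ on $[t_k,t_{k+1})$ and $\pi_{t_k}$ is constant
$\Theta_k$-almost surely. Moreover, by (2.5)

$\pi_t(g) = \pi_{t_k}(g) + \int_{t_k}^t \pi_s(\mathcal{A}^{u_{kj}} g)\,ds
        + \int_{t_k}^t [\pi_s(gh) - \pi_s(g)\pi_s(h)] \cdot d\hat{w}_s$

for $t_k \le t \le t_{k+1}$. Hence, by the definition of $\Psi$ in
Lemma 7,

$E^{\Theta_k} V_{k+1}(\pi_{t_{k+1}}) = \Psi(\pi_{t_k},u_{kj};V_{k+1},t_{k+1} - t_k)$.

We then have, with $P$-probability 1,

$E\{V_{k+1}(\pi_{t_{k+1}}) \mid \mathcal{Y}_{t_k}\}
        = E^{\Theta_k} V_{k+1}(\pi_{t_{k+1}}) < V_k(\pi_{t_k}) + \varepsilon/m$,

$E\{\pi_T(\Phi) \mid \mathcal{Y}_{t_k}\}
        = E\{E\{\pi_T(\Phi) \mid \mathcal{Y}_{t_{k+1}}\} \mid \mathcal{Y}_{t_k}\}$
$\qquad \le E\{V_{k+1}(\pi_{t_{k+1}}) \mid \mathcal{Y}_{t_k}\} + \varepsilon(1 - (k+1)/m)$
$\qquad < V_k(\pi_{t_k}) + \varepsilon(1 - k/m)$.

This proves Lemma 10.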

We get the inequality (8.2) needed to prove Theorem 2 by
taking $k = 0$ in Lemma 10, since $\mathcal{Y}_0$ is the trivial
$\sigma$-algebra and $E\pi_T(\Phi) = E\Phi(x_T)$.


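9. A Sufficient Condition for $\alpha_s = \alpha$.

According to (3.4), $\alpha_s \le \alpha$; while by Theorem 2,
$\alpha_s^\Delta = \alpha^\Delta$. Since the class of
$\Delta$-admissible systems is contained in the class of admissible
systems (either (P.O.) or (S)), we also have
$\alpha^\Delta \ge \alpha$ and $\alpha_s^\Delta \ge \alpha_s$.
Therefore, we will have $\alpha_s = \alpha$ if we can show that

(9.1)  $\alpha_s = \inf_\Delta \alpha_s^\Delta$.

Indeed, (9.1) gives
$\alpha_s = \inf_\Delta \alpha^\Delta \ge \alpha \ge \alpha_s$.
Unfortunately, we have verified (9.1) only under the rather
restrictive assumptions of Theorem 3 below.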

The proof of Theorem 3 will proceed as follows. Given any
admissible system (S) $(\pi,u,b)$, approximations $(\pi^n,u^n,b)$ are
made such that $u^n_t$ is piecewise constant. It is shown that
$\pi^n_t$ is near $\pi_t$, in a suitable sense, if
$E\int_0^T |u_t - u^n_t|^2\,dt$ is small. See (9.5). However, our
proof of this estimate uses boundedness of the generators
$\mathcal{A}^u$. To simplify matters we also assume that the control
enters linearly, as in (4.1).



where  j * 0,1,2,.. 


Since  y°,yl  are 
some  M 


. Let 


aki (t^ 

■ max 

»t(f) 

t 

u 

Ykj 

“ max 

f€A 

llfll2 

‘j 

1 |u-un 

1 1 2 * U 

fT  , 

iu  -ujrat 
0 1 1 

bounded  operators  and  ||h|| 


Vi,j5Mvkj,  y0j  < c2. 


? V 


Hence,  Y..  < C M . In  (9.2)  wc  take  g € * . . 


kj 

bounded  operator  wc  have  for  some  K 


kj 


1 


CJ.3)  Bkj(t)  < . ekJC»)  . Ykj 


Take  P > M,  and  let 


(9.4) 


MO  - l p'\,(t) 

J k-o  KJ 

(O 

act)  - i 2'Je.(t). 


J-o 


Jj 


< 00 , we  have  for 


Since  is  a 


bQ0(s)]ds  + Ykj  | | u-u 


From  (9.3) 


A 


Li 
3.(t)  < K.  ( f (P+1)  e.  (s)  ♦ [%nnCs)C  I p'kC2Mk)ds
j ' L h J J0  UU  k=0 


+ c l p'kC2Mk)| |u-un| |2. 
k=0  L 


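Since $\beta_{00} \le \beta$, we then have for some $K_2$

$\beta(t) \le K_2 [\int_0^t \beta(s)\,ds + \|u - u^n\|^2]$,

and hence, by Gronwall's inequality,

(9.5)  $\beta(t) \le K_2 \exp(K_2 T)\,\|u - u^n\|^2$,  $0 \le t \le T$.
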

To complete the proof of Theorem 3, given $(\pi,u,b)$ admissible (S)
let $u^n$, $n = 1,2,\ldots$, be a sequence of piecewise-constant
controls such that $\|u - u^n\|^2 \to 0$ as $n \to \infty$. The control
$u^n_t$ is constant on intervals $[t_k,t_{k+1})$ of some partition
$\Delta_n$ of $[0,T]$, and $u^n_t$ is $\mathcal{G}_{t_k}$-measurable on
$[t_k,t_{k+1})$. Since $\mathcal{A}^0$ and $\mathcal{A}^1$ are
bounded operators, the technique of successive approximations
provides a solution $\pi^n_t$ to (4.2) corresponding to the initial
data $\pi_0$ and $u^n_t,b_t$. We omit the proof (one proof uses a
method like the one above, with $\beta$ the difference between
successive approximations to the solution $\pi^n$). Inequality (9.5)
implies that $E\pi^n_t(g_j) \to E\pi_t(g_j)$ as $n \to \infty$. Since
$g_0,g_1,g_2,\ldots$ span $C(\Sigma)$,
$E\pi^n_t(g) \to E\pi_t(g)$ for all $g \in C(\Sigma)$,
$0 \le t \le T$. In particular, $E\pi^n_T(\Phi) \to E\pi_T(\Phi)$.
Since $\alpha_s \le \alpha_s^{\Delta_n} \le E\pi^n_T(\Phi)$,

$\alpha_s \le \limsup_{n \to \infty} \alpha_s^{\Delta_n} \le E\pi_T(\Phi)$.

Since the infimum of the right side among all admissible $(\pi,u,b)$
is $\alpha_s$, this proves Theorem 3.
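
The approximation of a control by piecewise-constant controls used
above can be illustrated in the simplest setting. The Python sketch
below is illustrative only and makes assumptions beyond the text: the
control is a deterministic, continuous function of $t$ (so no
expectation appears), the approximant takes the left-endpoint value
$u(t_k)$ on each $[t_k,t_{k+1})$ of a uniform partition (a choice that
keeps it nonanticipating), and the squared norm is computed by a
midpoint quadrature rule. The name `l2_error` is hypothetical.

```python
import math


def l2_error(u, n_intervals, T=1.0, n_quad=10000):
    """Approximate int_0^T |u(t) - u^n(t)|^2 dt by the midpoint rule,

    where u^n(t) = u(t_k) on [t_k, t_{k+1}) and the partition of [0, T]
    has n_intervals subintervals of equal length.
    """
    dt = T / n_quad
    err = 0.0
    for i in range(n_quad):
        t = (i + 0.5) * dt                                   # quadrature node
        k = min(int(t / T * n_intervals), n_intervals - 1)   # index of subinterval
        t_k = k * T / n_intervals                            # left endpoint
        err += (u(t) - u(t_k)) ** 2 * dt
    return err
```

For a Lipschitz control such as $u(t) = \sin 2\pi t$ the error decreases
as the partition is refined, roughly in proportion to the square of the
mesh size.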




REFERENCES 


[1]  P. Billingsley, Convergence of Probability Measures, Wiley,
New York, 1968.

[2]  J-M. Bismut, Sur un problème de contrôle stochastique avec
observation partielle, Z. Wahrscheinlichkeitstheorie Verw.
Gebiete.

[3]  M.H.A. Davis, Nonlinear semigroups in the control of partially-
observable stochastic systems, in Measure Theory and Applications
to Stochastic Analysis, Ed. G. Kallianpur and D. Kölzow,
Springer-Verlag Lecture Notes in Math., No.

[4]  W.H. Fleming and M. Nisio, On the existence of optimal
stochastic controls, J. Math. and Mech. 15 (1966), 777-794.

[5]  W.H. Fleming and M. Viot, Some measure-valued Markov processes
in population genetics theory, Indiana J. Math. (to appear).

[6]  H. Kunita, Asymptotic behavior of nonlinear filtering errors of
Markov processes, J. Multivariate Analysis 1 (1971), 365-393.

[7]  R.S. Liptser and A.N. Shiryayev, Statistics of Random Processes I,
Springer-Verlag, 1977 (tr. from Russian).

[8]  P.A. Meyer, Séminaire de Probabilités I, Springer Lecture Notes
in Math., No. 39, 1967.

[9]  P. Priouret, Ecole d'Eté de Probabilités de Saint-Flour II,
Springer-Verlag Lecture Notes in Math., No. 390, 1974.

[10]  A. Segall, Optimal control of noisy finite-state Markov processes,
IEEE Trans. on Auto. Control, Apr. 1977, 179-186.

[11]  J. Szpirglas, Sur l'équivalence d'équations différentielles
stochastiques à valeurs mesures intervenant dans le filtrage
markovien non linéaire, Ann. Inst. Henri Poincaré, Sec. B,
14 (1978), 33-59.

[12]  A. Bensoussan and J.L. Lions, Applications des inéquations
variationnelles en contrôle stochastique, Méthodes Math. de
l'Informatique, No. 6, Dunod, 1978.


